- MIT-APurify v1.3
- ----------------
-
- MIT-syntax version (GCC).
-
- (c) by Samuel DEVULDER
- jan. 1996
-
- Samuel.Devulder@info.unicaen.fr
-
- DESCRIPTION (SHORT):
- --------------------
- This is APurify for compilers with MIT syntax asm-files. As far as I
- know, only GCC uses such a syntax, so this version is in effect a version
- for the GCC compiler. If you are using another compiler, then read
- MOT-APurify instead. In the rest of this document, APurify stands
- for MIT-APurify, and I assume you're using the GCC compiler.
-
- APurify is a program that allows you to detect bad accesses to memory
- in your programs without any kind of specific external device (MMU).
- It helps you find bugs due to accessing memory not owned by your program.
-
- INSTALLATION:
- ------------
- This archive contains the version of APurify for the GCC compiler
- as well as for other compilers. Here is a description of the gcc-related
- files of this archive, together with what to do with each of them
- to complete the installation.
-
- - doc/MIT-APurify.doc The file you are currently reading. Put it
- with all your doc files. It is useful from
- time to time.
-
- - doc/History The whole history. (This file is not very
- useful for most people.) Do whatever you
- want with it.
-
- - bin/MIT-APurify The parser tuned for the MIT syntax. Rename
- it as APurify and put it somewhere in your
- path.
-
- - lib/APur-gcc.a The link-time library. Rename it as APur.a
- and put it somewhere in your library
- search-path.
-
- - test/test.c Source of a stupid test file. Just here to
- let you remake the test program. Do
- whatever you want with it.
-
- - test/test.gcc The test program, APurify'ed. Run it to see how
- APurify is useful :-).
-
-
- SYNOPSIS:
- --------
- Usage: APurify [-revinfo] <inputfile> [options]
-
- Where options can be:
-
- ? To display this usage
- -h To display this usage
- -? To display this usage
- -tb To test memory referenced through base register
- -ts To test memory referenced through stack register
- -tl To test memory referenced through local stack frame
- -tp To test pea instructions
- -o arg Specifies output file (def=%s)
- -br arg Sets the base register (def=A4)
- -mp arg Sets the main entry-point (def=_main)
-
- Options can be anywhere on the command line. NOTE: They can no longer
- be merged together; they must be separated by a space. You can pre-define
- them with the environment variable AP_MITP_OPT. For example, if you do:
-
- CLI> SetEnv AP_MITP_OPT "-tb -br A5"
-
- Then "-tb -br A5" will automatically be added to the command
- line. The space between an option and its argument can be omitted. Thus
- "-br A4" is the same as "-brA4". Here is a description of arguments and
- flags:
-
- -revinfo This displays information about APurify (name, size and
- date of modules and number of compilations done for that
- version).
-
- -br arg This sets the base register used to reference memory
- in SMALL_DATA model. Usually A4 is used for that purpose
- and that's the default. If A5 is used instead then add
- -brA5 on your command line.
-
- -tb This enables APurify to check all memory referenced through
- the base register (see -br). If you are using a SMALL_DATA
- model, add this flag on your command line. By default,
- APurify won't check memory referenced through the base
- register.
-
- NOTE: for safest check, you should always use that option,
- even if you're not in smalldata model (A4 may be used as
- a temporary register in that case). To allow this, you can
- use the environment variable.
-
- -ts This enables APurify to check memory referenced by the stack
- pointer (SP or A7). By default APurify won't check such
- memory accesses (to reduce the code size and increase the
- runtime speed). That option will detect when you have no
- more room on your stack (stack overflow).
-
- -tl This enables APurify to check memory referenced by the local
- stack pointer (the one that is link'ed and unlink'ed when
- entering and exiting a C function). By default, this is
- switched off. This option allows APurify to detect stack
- overflow.
-
- -tp This enables APurify to check indirect addresses pushed onto
- the stack by using a pea. By default this is off. When
- used, that option will check things like "pea a2@(10)" or
- the like. This can help you with memory accessed through a
- pointer in code that has not been APurify'ed. For example
- this is useful for things like fread(&ptr[10],10,1,fp)
- because in that case the "pea a2@(10)" used to push
- &ptr[10] on the stack will be checked and if ptr[10] is not
- owned by your program, you'll get an APurify error. Please
- note that this may not work all the time since &ptr[0] can be
- translated as "movel a0,sp@-" which won't be checked.
-
- -o arg This specifies the name of the outputfile. If omitted, the
- outputfile will be the same as the inputfile (source file).
- The name of the output file can be given as a real name
- or as a pattern. A pattern is a string where special sequences
- of characters (called specifiers) are replaced by special
- strings. Let's suppose that inputfile is equal to
-
- drive:path/file.ext
-
- Here is a description of specifiers:
-
- %s will be replaced by the full source name:
-
- drive:path/file.ext
-
- %S will expand to the full source name without the
- extension:
-
- drive:path/file
-
- %b stands for the full basename:
-
- file.ext
-
- %B is a shortcut for the full basename without the
- extension:
-
- file
-
- %p is the path (ending "/" or ":" is included):
-
- drive:path/
-
- %e is the extension ("." is omitted):
-
- ext
-
- Thus, if you put "-o ram:%B-apurify.%e" on the command line,
- then the outputfile will be "ram:file-apurify.ext" with
- our example.
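
The specifier table above can be sketched in C as follows. This is only an illustration of the expansion rules as described in this document, not APurify's own code: the function name expand_pattern() and its signature are made up for the example, and unknown specifiers are simply copied through, which is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not APurify's source): expand an -o pattern
 * against an input file name, following the specifier table above.
 * Returns 0 on success, -1 if the result does not fit in out. */
int expand_pattern(const char *pat, const char *src, char *out, size_t outsz)
{
    /* The basename starts after the last '/' or ':' (AmigaDOS paths). */
    const char *base = src;
    for (const char *p = src; *p; p++)
        if (*p == '/' || *p == ':')
            base = p + 1;

    /* The extension starts after the last '.' of the basename, if any. */
    const char *dot = strrchr(base, '.');
    size_t srclen  = strlen(src);
    size_t stemlen = dot ? (size_t)(dot - src) : srclen;  /* length of %S */
    size_t n = 0;

    if (outsz == 0)
        return -1;

    for (; *pat; pat++) {
        const char *piece = pat;   /* default: copy the character itself */
        size_t len = 1;

        if (*pat == '%' && pat[1] != '\0') {
            switch (pat[1]) {
            case 's': piece = src;  len = srclen;                         break;
            case 'S': piece = src;  len = stemlen;                        break;
            case 'b': piece = base; len = strlen(base);                   break;
            case 'B': piece = base; len = stemlen - (size_t)(base - src); break;
            case 'p': piece = src;  len = (size_t)(base - src);           break;
            case 'e': piece = dot ? dot + 1 : "";
                      len   = dot ? strlen(dot + 1) : 0;                  break;
            default:  piece = pat;  len = 2;  /* assumption: keep as typed */
                      break;
            }
            pat++;                 /* skip the specifier letter */
        }
        if (n + len >= outsz)
            return -1;
        memcpy(out + n, piece, len);
        n += len;
    }
    out[n] = '\0';
    return 0;
}
```

Calling expand_pattern("ram:%B-apurify.%e", "drive:path/file.ext", buf, sizeof buf) fills buf with "ram:file-apurify.ext", which matches the example given in the text.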
-
- -mp arg This tells APurify which label should be considered as the
- entry-point. By default it is set to "_main", and it should
- not be modified.
-
- -?
- -h
- ? Obvious options.
-
-
- DESCRIPTION (A BIT LONGER):
- --------------------------
- As a general rule, at the microprocessor level, there are two kinds
- of ways to access memory: direct access and indirect access.
- For example, in C, direct access can be viewed as accessing
- global variables. Indirect access corresponds to accessing an array
- value. More precisely, direct access corresponds to reading or writing
- a variable whose address is known at compilation time (or once the
- program has been loaded into memory). Indirect access is used for
- variables whose address is dynamically determined by the program. For
- example, if p is a pointer to an array allocated by malloc(), *p is an
- indirect access. Such an access also occurs with an expression like
- T[i] where T is a global array, because the address of T[i] is not
- known at compilation time, since it depends on the index value i. Using
- indirect access to memory is called indirection.
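
A minimal C illustration of the two kinds of access described above. The names counter, table, bump(), pick() and first() are invented for the example, and which instructions gcc actually emits for each function depends on the compiler options and the code model in use.

```c
#include <stdlib.h>

int counter;     /* global scalar: its address is fixed at link/load time */
int table[8];    /* global array: the base is fixed, element addresses are not */

/* Direct access: the address of counter is known before the program runs. */
void bump(void) { counter++; }

/* Indirect access: the address of table[i] depends on the run-time index. */
int pick(int i) { return table[i]; }

/* Indirect access: p is only known at run time (e.g. a malloc() result). */
int first(const int *p) { return *p; }
```

Of these three functions, only pick() and first() perform what this document calls an indirection; bump() is a direct access.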
-
- A regular program must not access memory not owned by it. That kind
- of access can be qualified as illegal.
-
- Illegal direct access to memory is not possible, because by
- definition, only global variables can be accessed that way and those
- variables obviously belong to the program (except for code written in
- assembly language that references absolute values, for example:
- "btst #6,$bfe001"; but that kind of code is not good programming
- :-)). So we can assume that direct access to memory is always right.
-
- On the other hand, indirect access to memory can certainly
- be illegal. Many bugs are caused by overstepping array boundaries. If
- the overstepping happens while reading a value, there is not much trouble
- for other running tasks (it is an error inside your task); but if it
- happens while writing, you may directly interfere with other tasks and a
- big mess can happen (total breakdown of the system).
-
- APurify works on that kind of access by verifying the validity of
- indirect accesses to memory. It remembers the memory that was allocated
- by the program and checks the integrity of each access. One might think
- that makes a lot of tests! Well, yes, but APurify is not designed to be
- used in the general use of programs; just in test phases. Moreover,
- indirections do not actually occur very often. Only array-based
- variables produce indirections. Thus, the variables on the stack
- --although being accessed by indirection-- are not checked because
- their access is always safe (at least if there is no stack overflow!).
- Also, in SMALL_DATA model, global variables are accessed through
- indirection, but they are not checked.
-
- If an illegal access is found, APurify displays an error message on
- the error stream of the program by default. There are two kinds of
- illegal accesses. Some are accesses to memory that doesn't belong to
- the program at all (this is called an access between blocks); others
- are accesses that touch partly memory owned by the program and partly
- memory not owned by it (this is an overstepping of a block). You can see
- this visually: If [ 1 ] and [ 2 ] represent two blocks allocated by the
- program and ( 3 ) the memory accessed, then
-
- ---- [ 1 ] ---- ( 3 ) ---- [ 2 ] ---->
- 0 increasing address
-
- corresponds to the first kind of illegal access and
-
- ---- [ 1 ( ] 3 ) ---- [ 2 ] ----->
- or
- ---- [ 1 ] ---- ( 3 [ ) 2 ] ----->
-
- corresponds to the second kind of access. The first kind is very common
- but the second is quite rare (it's rather a misalignment problem).
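
The two pictures above can be turned into a small classification routine. The following is a sketch of the idea only: the Block type, the AccessKind names and classify() are invented for this example and do not describe how APur.a stores or searches its blocks.

```c
#include <stddef.h>

/* Illustrative types, not APurify's internal data structures. */
typedef struct {
    unsigned long start;   /* address of the first byte of the block */
    unsigned long len;     /* length of the block in bytes */
} Block;

typedef enum { ACCESS_OK, ACCESS_BETWEEN, ACCESS_ACROSS } AccessKind;

AccessKind classify(const Block *blocks, size_t nblocks,
                    unsigned long addr, unsigned long len)
{
    unsigned long end = addr + len;        /* one past the last byte accessed */
    AccessKind kind = ACCESS_BETWEEN;      /* holds until a block is touched */

    for (size_t i = 0; i < nblocks; i++) {
        unsigned long bstart = blocks[i].start;
        unsigned long bend   = bstart + blocks[i].len;

        if (addr >= bstart && end <= bend)
            return ACCESS_OK;              /* fully inside one block */
        if (addr < bend && end > bstart)
            kind = ACCESS_ACROSS;          /* overlaps a block boundary */
    }
    return kind;
}
```

With the blocks $00245f10(16) and $0026d920(96) taken from the sample report in the EXAMPLE section, a 4-byte access at $00245f1e is classified as ACCESS_ACROSS and a 4-byte access at $00245f20 as ACCESS_BETWEEN, in line with what that report shows.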
-
- APurify has two output modes. One is verbose and tries to give a lot
- of information by using words. The other one is more brief and gives
- you the same information, but you'll have to decode it.
-
- When APurify starts and ends, it outputs the date/time. This is
- useful if you are using logfiles. With that, you can keep all your logs
- in a single file and retrieve any execution by its date of
- execution.
-
- In case of an error, APurify displays some text. The first line
- looks like this one:
-
- **** APURIFY ERROR ! [$<N1>(<N2>) <ATTR> (<TEXT1>)] <TEXT2>:
-
- That line represents the accessed memory. <N1> is the hexadecimal
- address accessed. <N2> is the length of the access (in decimal). <ATTR>
- represents the type of access. <TEXT1> allows you to find where in your
- code the illegal access happened. <TEXT2> describes the kind of
- illegal access.
-
- If the length (<N2>) is 1, then it was a byte access. 2 stands for
- a short access, 4 for an int/long and >4 for a movem instruction.
- Attributes, <ATTR>, can be "R--" or "-W-". The first one represents an
- access reading a value and the second an access writing a value.
-
- The text <TEXT1> looks like this:
-
- <NAME>, PC=$<PC#> HUNK=$<HUNK#> OFFSET=$<OFF#>
-
- <NAME> is the name of the subroutine where the error occurred. It is
- always displayed (even if it is a "static" one). The rest of the line
- may be only partially displayed, showing as much information as APurify
- can get. <PC#> is a hexadecimal address pointing to the instruction that
- produced the error. <HUNK#> and <OFF#> are the hunk number and the
- relative offset of <PC#>. Using <HUNK#> and <OFF#> and a disassembler,
- you can very easily find where your code is bad (BTW, I use dobj from
- netdcc, (c) by Matt Dillon). Please note that in this new version,
- <PC#> no longer points to some instruction before the faulty one. It
- is always the real faulty address.
-
- The remaining lines show the context of the illegal access. They
- give you information about the surrounding memory blocks owned by
- your program. Each block is displayed according to the following
- pattern:
-
- [$<N1>(<N2>) <ATTR> (<TEXT>)]
-
- where <N1> is the hexadecimal address of the beginning of the block,
- <N2> its length (in decimal). Note that the length may seem to be
- longer than the one allocated by malloc() and the address may point
- before the one you obtained via malloc(). This is not wrong! In fact
- you must know that the malloc() subroutine may add some information
- (like a doubly-linked list or the length of the allocation) to the
- block you've requested. This extra information is put before the
- address you receive. That explains this behavior. In this version of
- APur.lib, this takes 12 ($C) extra bytes. So if you allocate 10 bytes,
- don't be surprised if APurify thinks you've requested 22 bytes.
-
- <ATTR> are 3 status characters RWS
-
- where R means: read-enable block
- W means: write-enable block
- S means: system block (block not controlled by the program).
-
- If one access is forbidden, the letter '-' replaces the corresponding
- character. <TEXT> is actually the name of the procedure that has
- allocated the block.
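
As a sketch of how a block line can be decoded when post-processing a logfile, the following C fragment parses the pattern above with sscanf() and undoes the 12-byte malloc() overhead mentioned in the text. BlockInfo, parse_block(), requested_size() and user_pointer() are names invented for this example; the header correction only makes sense for blocks obtained through malloc().

```c
#include <stdio.h>

#define AP_HEADER_BYTES 12UL   /* extra bytes in front of a malloc() block,
                                  as stated in the text for this version */

typedef struct {
    unsigned long addr;        /* <N1>: start of the block (hexadecimal) */
    unsigned long len;         /* <N2>: length of the block (decimal) */
    char attr[4];              /* <ATTR>: e.g. "RW-" */
    char owner[64];            /* <TEXT>: who allocated the block */
} BlockInfo;

/* Parse a line such as "[$00245f10(16) RW- (_main)]".
 * Returns 1 on success and 0 if the line does not match. */
int parse_block(const char *line, BlockInfo *b)
{
    return sscanf(line, " [$%lx(%lu) %3s (%63[^)])]",
                  &b->addr, &b->len, b->attr, b->owner) == 4;
}

/* Size that was passed to malloc(), assuming the 12-byte header. */
unsigned long requested_size(const BlockInfo *b)
{
    return b->len >= AP_HEADER_BYTES ? b->len - AP_HEADER_BYTES : 0;
}

/* Address that malloc() returned to the program, under the same assumption. */
unsigned long user_pointer(const BlockInfo *b)
{
    return b->addr + AP_HEADER_BYTES;
}
```

For the block "[$00245f10(16) RW- (_main)]" from the sample report, this gives a requested size of 4 bytes and a user pointer of $00245f1c, which agrees with the "16-12=4" remark made in the EXAMPLE section.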
-
- With each block you can find an offset. That offset is the distance
- between that block and the faulty address. In verbose mode, you can
- see some text explaining the relative position of a block
- and the accessed memory. In non-verbose mode you just see the
- offsets followed by the blocks. The smallest offset is displayed first
- since that block is the one that was most likely overstepped.
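
The offsets printed for accesses between blocks can be reproduced with a little arithmetic. The rule below is inferred from the numbers in the sample report of the EXAMPLE section, not taken from APurify's source: the offset is the distance in bytes between the nearest bytes of the access and of the block, negative when the block lies above the access. The function name block_offset() is invented for this sketch.

```c
/* Signed distance between an access and a block, as it appears in the
 * sample report (inferred rule, see the lead-in above). */
long block_offset(unsigned long addr, unsigned long len,
                  unsigned long bstart, unsigned long blen)
{
    unsigned long last  = addr + len - 1;      /* last byte accessed */
    unsigned long blast = bstart + blen - 1;   /* last byte of the block */

    if (last < bstart)                         /* block is above the access */
        return -(long)(bstart - last);
    if (addr > blast)                          /* block is below the access */
        return (long)(addr - blast);
    return 0;                                  /* overlap: not handled here */
}
```

For the first error of the sample report, the 4-byte access at $0026defc gives -25 against the block $0026df18(27) and +1405 against the block $0026d920(96), which are the two offsets shown there.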
-
- When an illegal write occurs (the only dangerous thing you can do
- by indirection, indeed), a requester opens to tell you about it. With
- that requester, you can stop your program to prevent the deadly error
- from really happening. If you choose to do so, exit() is called. You can
- also ignore that error or ignore all such errors (but then you'll surely
- meet the guru!).
-
- APurify checks the memory allocated but not freed by the program.
- (in fact, it detects non deallocated-blocks on library-closing time).
-
- It knows about memory locations that are independent of the program
- execution. That is to say, the first kilobyte of memory that contains
- the interrupt vectors of the 680x0 processor, the program segments and
- the stack. Accessing those blocks will be illegal. They have the S
- attribute (for SYSTEM blocks).
-
- It takes into account memory blocks allocated by malloc() and
- AllocMem(), and indirectly allocated blocks (by OpenScreen() for
- example). I did not test the last kind of allocation, but it should be
- ok, because APurify patches the AllocMem() & FreeMem() entries. Thus a
- program can access the bitplanes of one of its screens without error.
-
- If the program makes a legal access, but attributes are
- incompatible with the access-kind, a protection-error message is
- displayed. Actually only the first kilobyte is read/write-protected.
- But it may change in the future.
-
-
- HOW TO USE APURIFY:
- ------------------
- One can see APurify as a pre-assembler. It must be used on the assembly
- language sourcefile just before the assembler takes over. It scans the
- file and changes it a bit so that APur.a can be used.
-
- The normal way to use it for a C program is to:
-
- - compile C sourcefiles and leave assembly language source (.s).
- - use APurify on each .s file.
- - assemble each .s file to get a .o file.
- - link all .o files together with APur.a.
-
- For example, using gcc on prog.c gives:
-
- CLI> gcc -g prog.c -o prog.s -S
- CLI> APurify -tb prog.s
- CLI> gcc -g prog.s -o prog -lAPur
-
- As you can see, APurify needs no change to your C files to be used.
- In this release you no longer need to call AP_Init() in the main()
- function. The call is automatically inserted when the main-entry label
- (specified by -mp) is found. You should not use dos.library/Exit() to
- abort your program; I think it'll crash if APurify is running. If you
- must use Exit() then call AP_Close() just before calling Exit(). The
- explanation is simple: since some system functions are patched, if a
- program exits without closing the library, those patches will be
- corrupted, pointing to code that is no longer in memory, and you'll meet
- the guru (i.e. the computer will crash)... (You've been warned :-).
-
- You can disable/enable printing of messages by making a call to
- AP_Report(flag). If flag is true (i.e. different from zero) then
- printing is enabled; if it is false (i.e. equal to zero), no output will
- be produced. This is useful for startup code. For example, if you are
- using the argv[] array in C, APurify will print a lot of false errors.
- This is because the values pointed to by this array are allocated
- before the library is opened. You can avoid this by calling
- AP_Report(0) before (and AP_Report(1) after) the code that uses argv[].
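
A sketch of the AP_Report() bracketing described above. The prototype void AP_Report(int) is an assumption, since this document only shows the call form AP_Report(flag). The stand-in definition is there solely so that the fragment compiles on its own; remove it when linking with APur.a. total_arg_length() and ap_reporting are names invented for the example.

```c
#include <stdio.h>

/* Stand-in for the function provided by APur.a (assumed prototype).
 * It only records the flag; the real one switches message printing. */
static int ap_reporting = 1;
void AP_Report(int flag) { ap_reporting = (flag != 0); }

/* Walk over the argv[] strings with reporting switched off, because
 * those strings were allocated before the library was opened. */
unsigned long total_arg_length(int argc, char **argv)
{
    unsigned long total = 0;

    AP_Report(0);                      /* silence the false errors */
    for (int i = 1; i < argc; i++)
        for (const char *p = argv[i]; *p; p++)
            total++;
    AP_Report(1);                      /* back to normal reporting */
    return total;
}
```

Keeping the silenced region as small as possible matters here, since any real error that happens between the two calls goes unreported as well.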
-
- When debugging an APurify'ed program, you can put a breakpoint on
- a function called AP_Err(). That function is called each time
- APurify detects an error. With that, you'll have the opportunity to look
- at your program just before a faulty memory access occurs.
-
- You can switch from a verbose output to a shorter one with
- AP_Verbose(flag). If flag is true then the verbose mode is on. If it is
- false then only short messages will be printed. Some people prefer the
- latter so that is the default. If you prefer the verbose output then put
- AP_Verbose(1) somewhere in your code and you'll get some longer
- explanations about illegal accesses.
-
- You can specify a logfile where APurify can put its errors. To do
- this, set the environment variable "APlog" (file ENV:APlog) to the name
- of a logfile. If this variable is set, then APurify will append all its
- output to the file indicated. If this variable does not exist, then
- the standard error stream is used.
-
-
- EXAMPLE:
- -------
- As an example, let's look at the test program compiled with
- gcc-2.6.0. You'll see how you can use the APurify report it produces to
- find what's wrong in the program. For this, I've included in that
- document the commented report. My comments/explanations appear on lines
- beginning with a "#".
-
- **** APurify started on Thu Jan 4 23:03:58 1996
-
- #
- # Well, the report started...
- #
-
- **** APURIFY ERROR ! [$0026defc(4) R-- (_main, PC=$0027eef0 HUNK=$0
- OFFSET=$410)] accessed between:
- -25 [$0026df18(27) RW- (_main)]
- +1405 [$0026d920(96) RWS (segment Module CLI)]
-
- #
- # Hum... First hit... it is an error in reading something in the main()
- # procedure between two blocks already allocated. The nearest block
- # appears in first position, so we can think that the error was done by
- # accessing an array allocated in main() with a negative index. We can
- # look at the code to find what is wrong with it. Using DOBJ, we found
- # at offset $410 in the first hunk the following code:
- #
- # 00.00000410 24ab ffd8 MOVE.L -40(A3),(A2)
- #
- # This corresponds to the C code:
- #
- # a[0]=b[-10]
- #
- # Hence we've discovered a first error in the code. Note that -25 is
- # the distance (in bytes) between the end of the accessed memory and
- # the beginning of the array. This is not the difference between the
- # beginning address of the two blocks!
- #
-
- **** APURIFY ERROR ! [$00245f20(4) R-- (_main, PC=$0027ef1a HUNK=$0
- OFFSET=$43a)] accessed between:
- +1 [$00245f10(16) RW- (_main)]
- -162301 [$0026d920(96) RWS (segment Module CLI)]
-
- #
- # Well... here it seems to be an access just after an allocated block.
- # The offset +1 is the distance in bytes between the accessed block and
- # an allocated block. The situation is like this:
- #
- # ---------[ 1 ]( 2 )---------->
- #
- # Where "[ 1 ]" is the allocated block and "( 2 )" the accessed block.
- # If we look in the code, we find:
- #
- # 00.0000043a 4aaa 0004 TST.L 4(A2)
- #
- # that corresponds to the test done by "if(a[1] == 0)". This is an error
- # since the array 'a' is just 16-12=4 bytes long. So a[1] points out of
- # the array!
- #
-
- **** APURIFY ERROR ! [$00245f1e(4) R-- (_read_shifted, PC=$0027ed9e
- HUNK=$0 OFFSET=$2be)] accessed across the ending boundary of:
- -2 [$00245f10(16) RW- (_main)]
-
- #
- # Hehe another error... Damn! That test program is FULL of bugs!
- # Yes, but that one is another kind of error. It is an access across a
- # boundary. It occurs in the read_shifted() code. We need not look in
- # the asm file to see the error. Here it is a misalignment error.
- # Visually that gives:
- #
- # ------------[ 1(]2 )----------->
- #
- # [ 1 ] = allocated ( 2 ) = accessed.
- #
-
- **** APURIFY ERROR ! [$00245f1c(4) R-- (_read_long, PC=$0027edce
- HUNK=$0 OFFSET=$2ee)] accessed between:
- -162305 [$0026d920(96) RWS (segment Module CLI)]
- +2382621 [$00000000(1024) --S (Basic 680x0 vectors)]
-
- #
- # That error is strange! It is not an access to an array with a
- # negative index as one might immediately think: We never call read_long()
- # in such a way, and the offsets are too big! Indeed, the accessed memory
- # was valid some time ago since it lies in the array 'a' (look at the
- # second hit). Hence, it must be an access to free()'d memory. That
- # error is then obviously found in the code:
- #
- # free_arg(a); read_long(a).
- # ^^^^^^^^^^^^
- #
-
- **** APURIFY ERROR ! [$00000004(4) R-- (_read_page_zero, PC=$0027ee32
- HUNK=$0 OFFSET=$352)] accessed on a read-protected block:
- +4 [$00000000(1024) --S (Basic 680x0 vectors)]
-
- #
- # Here the error is obvious: we are reading the zero-page. If it were
- # a write, that error would be very dangerous.
- #
-
- **** APURIFY WARNING ! Closing library without deallocation of the
- following block(s):
- - [$00271540(412) RW- (_main)]
- - [$00287070(12012) RW- (_main)]
- - [$0032e2c0(40012) RW- (_main)]
-
- #
- # The program has exit()ed. APurify tells us that we've forgotten to free
- # those blocks. It is a case of memory leak. Those blocks were
- # allocated in main(). Those were allocated and lost by
- #
- # a=malloc(4),malloc(400),malloc(12000),malloc(40000)
- #
- # since the assignment only affects the first item of ",,,".
- #
-
- **** APurify ended on Thu Jan 4 23:04:00 1996
-
- #
- # Well... done :-).
- #
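
The leak reported at the end of this run comes from the way C parses the comma operator: "=" binds tighter than ",", so in a statement of the form a=f(),f(),f() only the first result is stored. The fragment below shows the same parsing rule with a harmless counter instead of malloc(); next_id() and comma_demo() are names invented for this illustration and are not part of test.c.

```c
static int calls;                      /* how many times next_id() ran */

static int next_id(void) { return ++calls; }

/* Same shape as the statement quoted in the report comments:
 * parsed as (x = next_id()), next_id(), next_id(); */
int comma_demo(void)
{
    int x;

    x = next_id(), next_id(), next_id();
    return x;                          /* the result of the FIRST call only */
}
```

All three calls are executed, but the results of the second and third are discarded. With malloc() in place of next_id(), those discarded results are exactly the blocks that can never be freed.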
-
-
- LEGAL PART:
- ----------
- That program is provided 'AS IS'. I am not responsible for any
- damage it can cause (but I am responsible for the benefits it can give
- to you :-). Use that software at your own risk.
-
- That program is FREEWARE. You can use and distribute it as long as
- you keep the archive intact (no adulteration of files except for
- compression). It can't be sold without my agreement (except a minimal
- amount for media support). You must ask me for commercial use of (any
- part of) that product. I keep all my rights on that program and its
- future releases. I can modify that software without notifying the
- users.
-
- If you wish, you can send me a postcard or anything else you want
- (money, documentation, amiga, hardware stuff, ...) in exchange for
- using APurify. But there is no obligation :-). My postal address is:
-
- M. DEVULDER Samuel
- 1, Rue du chateau
- 59380 STEENE
- FRANCE
-
- (yes I'm french !). You can send suggestions or bugs to my email
- address:
-
- devulder@info.unicaen.fr
-
-
- NOTES:
- -----
- It has been compiled with cross-gcc 2.7.0 with libnix on a Sun
- sparc.
-
- I had the idea of that program after a chat with Cedric BEUST
- (AMIGA NEWS) on IRC (Internet Relay Chat). Thanks Cedric !
-
- I wish to thank Philippe Brand for his help in my port. I also wish
- to thank J.C Hoehle for his useful advice.
-
- All marks are proprietary of their respective owners.
-
- There are some programs like APurify. For example, FORTIFY (Simon
- P. Bullen), but it only detects illegal writes to boundaries of
- allocated blocks. Thus it can't detect big oversteps and oversteps in
- reading and the detection is not real-time. Enforcer can detect illegal
- access to memory, but it needs a special device (MMU).
-
-
- HINTS & TIPS:
- ------------
- You can see some memory leaks with that version of APurify. It is
- not really good but it can help. A memory leak occurs when a block of
- memory is no longer pointed to by your program. Those memory blocks will
- necessarily be displayed when your program exit()s. So with all the
- messages printed on that occasion, you can find such blocks. I know
- this is not so great, but I think it can help you a little bit (maybe
- in a future version I'll build some code to really check memory leaks).
-
-
- BUGS:
- ----
- APurify doesn't know about public memory that a program can read or
- write without having allocated it. Thus, it will report an error when a
- program reads or writes values in a message obtained through GetMsg()
- calls. Use AP_Report() to avoid such reports.
-
- It can display messages about closing the library without freeing
- some memory blocks. This is due to printf(), which allocates memory that
- is free'd on exit. This is not a real bug, but you can avoid this by
- doing an AP_Report(0) just before exiting. But you must note that it
- is better to display false bugs than to not display real ones.
-
- I've rewritten malloc()/realloc()/free(). I hope this will not
- produce bugs (I've successfully tested the test program with libnix and
- ixemul, so I hope it will be all right).
-
- Certainly more bugs, but I'm waiting for your bug-reports.
-